Estimation of Generating Processes of Strings Represented with Patterns and Substitutions

Authors

  • Keisuke Otaki
  • Akihiro Yamamoto
  • Tim Oates
Abstract

We formalize generating processes of strings based on patterns and substitutions, and give an algorithm that estimates a probability mass function on substitutions, which is one component of such a process. Patterns are non-empty sequences of characters and variables. Variables stand for unknown substrings and are replaced with other patterns by substitutions. Introducing variables and substitutions addresses two difficulties of generative grammars: preparing production rules in advance and representing context-sensitivity with them. Our key idea is to regard a sequence of substitutions as the generation of a string, and to assign probabilities to substitutions as in a probabilistic context-free grammar (PCFG). In this study, we first pose the problem of estimating a probability mass function from strings under our formalization, and then solve it with the Passing algorithm, which runs iteratively. Our experimental results on synthetic strings show that our method estimates probability mass functions with sufficiently small errors.
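The notions of pattern, substitution, and generation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Passing algorithm and performs no estimation; it only shows how a sequence of substitutions turns a pattern into a string. All names (`Var`, `apply_substitution`, `generate`, `generation_probability`) and the example probabilities are hypothetical, and scoring a generation as the product of its substitution probabilities is an assumption made here by analogy with PCFGs.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, Sequence, Tuple, Union


@dataclass(frozen=True)
class Var:
    """A variable standing for an unknown substring (hypothetical type)."""
    name: str


Symbol = Union[str, Var]              # a character or a variable
Pattern = Tuple[Symbol, ...]          # non-empty sequence of symbols
Substitution = Tuple[Var, Pattern]    # (variable, pattern that replaces it)


def apply_substitution(pattern: Pattern, sub: Substitution) -> Pattern:
    """Replace every occurrence of the variable with the replacement pattern."""
    var, replacement = sub
    out = []
    for symbol in pattern:
        if symbol == var:
            out.extend(replacement)
        else:
            out.append(symbol)
    return tuple(out)


def generate(start: Pattern, subs: Sequence[Substitution]) -> Pattern:
    """Apply the substitutions in order; the sequence is one 'generation'."""
    pattern = start
    for sub in subs:
        pattern = apply_substitution(pattern, sub)
    return pattern


def to_string(pattern: Pattern) -> str:
    """Convert a variable-free pattern into a plain string."""
    if any(isinstance(symbol, Var) for symbol in pattern):
        raise ValueError("pattern still contains variables")
    return "".join(pattern)


def generation_probability(subs: Sequence[Substitution],
                           pmf: Dict[Substitution, float]) -> float:
    """Assumed scoring: product of the probabilities of the substitutions used."""
    return prod(pmf[sub] for sub in subs)


x, y = Var("x"), Var("y")
start: Pattern = (x, "c", x)          # x occurs twice, so both copies stay equal
sub1: Substitution = (x, ("a", y))
sub2: Substitution = (y, ("b",))
pmf = {sub1: 0.5, sub2: 0.25}         # made-up probabilities for illustration

result = to_string(generate(start, [sub1, sub2]))
probability = generation_probability([sub1, sub2], pmf)
print(result, probability)
```

Because the variable `x` occurs twice in the starting pattern, both occurrences are rewritten identically, which is the kind of dependency between distant substrings that plain context-free production rules do not express directly.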


Similar resources

On the Estimation of Error-Correcting Parameters

Error-Correcting (EC) techniques allow for coping with divergences in pattern strings with regard to their “standard” form as represented by the language L accepted by a regular or context-free grammar. There are two main types of EC parsers: minimum-distance and stochastic. The latter apply the maximum likelihood rule: classification into the classes of the strings in L that have the greatest p...


THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)

Small area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. It is needed to obtain information on a small area, such as a sub-district or village. In some cases, small area estimation uses parametric modeling. In practice, however, many models have no linear relationship between the small area average and the covariat...


A Study of Weaving Techniques in Hand-Woven Textiles of the Fars Nomads: Obsolete Weaves

This study examines weaving techniques in three rare samples of Ghashghaei tribe hand-woven artifacts. In the first two samples, “Shishe dermeh baafi” and “O’ei baafi” (2 laaye baafi), we found similarities and differences in the weaving structures. Similarities: both are hand-woven products made by flat weaving, with warp and woof in two opposite colours. ...


Ruthenocuprates: An Arena of Competition Between Superconductivity and Magnetism

 We have compared the structural, electrical, and magnetic properties of Ru(Gd1.5-xPrx)Ce0.5Sr2Cu2O10-δ (Pr/Gd samples) with x = 0.0, 0.01, 0.03, 0.033, 0.035, 0.04, 0.05, 0.06, 0.1 and RuGd1.5(Ce0.5-xPrx)Sr2Cu2O10-δ (Pr/Ce samples) with x = 0.0, 0.01, 0.03, 0.05, 0.08, 0.1, 0.15, 0.2 prepared by the standard solid-state reaction technique with RuGd1.5(GdxCe0.5-x) Sr2Cu2O10-δ (Gd/Ce samples) wi...


Modelling improvisatory and compositional processes

An application of formal languages to the representation of musical processes is introduced. Initial interest was the structure of improvisation in North Indian tabla drum music, for which experiments have been conducted in the field as far back as 1983 with an expert system called the Bol Processor, BP1. The computer was used to generate and analyze drumming patterns represented as strings of ...



Publication date: 2012